-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not close FDs 0, 1, or 2 #186
base: main
Are you sure you want to change the base?
Conversation
29a0392
to
e123ed7
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #186 +/- ##
==========================================
+ Coverage 78.84% 78.96% +0.11%
==========================================
Files 55 56 +1
Lines 10146 10183 +37
==========================================
+ Hits 8000 8041 +41
+ Misses 2146 2142 -4 ☔ View full report in Codecov by Sentry. |
Codecov appears to not be testing what happens in the child process after fork() and the error path of “cannot open |
AFAIR unit tests do not cover the PAM handling part, as they are not running as a system service, test runners don't have necessary PAM configuration etc. |
OpenQA test summaryComplete test suite and dependencies: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2025021620-4.3&flavor=pull-requests Test run included the following:
New failures, excluding unstableCompared to: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2025020404-4.3&flavor=update
Failed tests28 failures
Fixed failuresCompared to: https://openqa.qubes-os.org/tests/127852#dependencies 29 fixed
Unstable testsPerformance TestsPerformance degradation:No issues Remaining performance tests:72 tests
|
This is the only qrexec PR in this test run, so the above failures seems to be regression caused by this change. |
e123ed7
to
c7a7826
Compare
libqrexec/libqrexec-utils.h
Outdated
@@ -162,6 +162,11 @@ void buffer_append(struct buffer *b, const char *data, int len); | |||
void buffer_remove(struct buffer *b, int len); | |||
int buffer_len(struct buffer *b); | |||
void *buffer_data(struct buffer *b); | |||
/* Open /dev/null and keep it from being closed before the exec func is called. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it simpler (and safer) to simply open /dev/null just before dup-ing it over 0,1,2 (in the child process already)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’s definitely simpler, but I didn’t want the extra call to open()
. That said, giving each child process the same open file description is less than great, as it introduces shared state that should not be there. I don’t know if /dev/null
has any such state, but still it isn’t great, so I went ahead and switched to your approach.
But also, I question usefulness of this PR as a whole - the closing of standard FDs happens in a process that has a single purpose - wait for the child process and then exit, in the very same function as closing happens. There are few PAM cleanup calls, but it's very unlikely for them to be a problem (especially, it isn't a problem now, or for the last 10 or so years). |
Whether or not the last commit in the PR is merged, I definitely think the other commits should be merged. In particular, it turned out that the “close the FD” functionality had no unit tests because the unit tests took a codepath that was too different from the production code. This PR makes the production and test code follow the same path, with the result that the actual bug (
PAM cleanup calls into PAM modules, so it can do anything. I suspect Qubes OS only gets away with it because we have a fairly simple PAM stack by default. PAM cleanup is used for e.g. unmounting filesystems and closing encrypted volumes. The best approach would be for PAM to run with stdin pointed at |
Indeed I was talking about the last commit (which until the last force-push was the only commit in this PR).
Aren't PAM modules expected handle proper logging themselves? I don't think they are supposed to touch calling process's stdin/out/err in any case. And if they would do, that likely would interfere also with cases where they aren't closed (and then replaced with with unrelated thing) - for example it could interfere with an application log file on stderr that is expected in a specific format (different than PAM messages). |
As for the other commits, won't that have some non-trivial conflicts with #141 (which I hope is quite close to merge-able state)? |
I can include them in #141 or rebase this PR on top of it. I can also close this PR if you prefer, but I’d prefer that at least the bug fixes and testability changes go in. |
c7a7826
to
4b941e6
Compare
fa8b528
to
61a0261
Compare
A lot of tests has failed, see openqa report from the bot |
I’ll work on fixing this. |
61a0261
to
db5faa9
Compare
I did some tests locally and it looks like the existing code already has a problem (one of the tests fails intermittently when I run it in a loop). |
Maybe there are some other problems too, but clearly several tests reliably fail with this PR, while they reliably pass without it. Do you want access to a VM with this PR installed? |
I already have one, and I can indeed reproduce the bug. It appears that at least some tests fail more often with this PR than without it, but they fail occasionally even without the PR, as shown by a The way I found this is that I tried to bisect the bug. I found 9b2e8fc (v4.3.3 + bug fixes) was bad if I ran one test in a loop, which in turn led me to try 63e0699 (v4.3.3 itself) which also failed. |
Recent run with db5faa9 has only one failure (1 failed out of 4 times this test was run) I tried to get more logs, but it isn't very interesting:
I don't see actual reason for the failure here... |
These should never happen, but call exit() if they do. Also avoid freeing an uninitialized PAM handle in such an error case. I do not consider this a security vulnerability because there is no reasonable way I know of for an attacker to trigger this failure, but this commit should still be backported.
Saves an (admittedly cheap) system call.
No functional change intended.
This will be used by tests later. No functional change intended.
This will be used by tests later. No functional change intended.
This also fixes a bug: basename can mutate its argument, so a copy must be passed to it.
This makes the unit test code more like the actual code used by end-users, and therefore makes the tests more accurate. This trips a bug in the code which will be fixed later, requiring a test to be changed to compensate.
If they are closed, another file descriptor could be created with these numbers, and so standard library functions that use them might write to an unwanted place. dup2() a file descriptor to /dev/null over them instead. Also statically initialize trigger_fd to -1, which is the conventional value for an invalid file descriptor. This requires care to avoid closing the file descriptor to /dev/null in fix_fds(), which took over an hour to debug.
db5faa9
to
ca34aa7
Compare
If they are closed, another file descriptor could be created with these numbers, and so standard library functions that use them might write to an unwanted place. dup2() a file descriptor to /dev/null over them instead.